Abstract

Background: The utility of self-report measures of physical activity (PA) in youth can be greatly enhanced by calibrating
self-report output against objectively measured PA data.

This study demonstrates the potential of calibrating self-report output against objectively measured physical activity (PA) 
in youth by using a commonly used self-report tool called the Physical Activity Questionnaire (PAQ). 

Methods: A total of 148 participants (grades 4 through 12) from 9 schools (during the 2009-2010 school year) wore an 
Actigraph accelerometer for 7 days and then completed the PAQ. Multiple linear regression modeling was used on 70% 
of the available sample to develop a calibration equation and this was cross validated on an independent sample of 
participants (30% of sample). 

Results: A calibration model with age, gender, and PAQ scores explained 40% of the variance in values for the
percentage of time in moderate-to-vigorous PA (%MVPA) measured from the accelerometers (%MVPA = 14.56 -
(sex × 0.98) - (0.84 × age) + (1.01 × PAQ)). When tested on an independent, hold-out sample, the model estimated %MVPA
values that were highly correlated with the recorded accelerometer values (r = .63) and there was no significant
difference between the estimated and recorded activity values (mean diff. = 25.3 ± 18.1 min; p = .17).

Conclusions: These results suggest that the calibrated PAQ may be a valid alternative tool to activity monitoring
instruments for estimating %MVPA in groups of youth.
Background

The development of more feasible and accurate methods of 
assessing physical activity behavior is an important public 
health research priority [1-4]. Objective monitoring devices 
have advantages but the high cost and burden of data pro- 
cessing make them impractical for large-scale applications 
[5-7]. Subjective (survey-based) tools are inexpensive and 
easy to use but they suffer from questionable validity [8]. 
Objective measures are often used to validate less accurate 
measures, such as subjective instruments, but this does not 
directly improve the accuracy or precision of the self-report
instrument. If simple, easy-to-use self-report instruments 
could be calibrated against more accurate assessments, it 
might be possible to generate equivalent estimates of PA in
a more efficient and cost-effective manner. 

Calibration is a commonly accepted measurement prac- 
tice that allows data to be scaled or adjusted to produce 
more accurate and usable estimates [9]. Considering the 
complexities of classifying and coding physical activity, it is 
actually quite naive to expect raw, uncalibrated, self-report 
estimates to even come close to individual-level estimates 
of PA [10]. However, questionnaires have been shown to be 
able to rank people according to their activity and estimate 
group-level PA in young populations with reasonable accur- 
acy [6,10]. 

The use of measurement error models and calibration
procedures is common in diet-related research, so it is
surprising that there are so few examples of measure- 
ment error studies [11] and calibration applications in 
studies of physical activity [12]. However, the growing 
interest in this topic is clear, as illustrated by the conference
on self-report measures, sponsored by the National
Institutes of Health that highlighted the value of such 
measures and the need for continued refinement [13]. 
This topic has also been addressed in recent epidemi- 
ology research studies that demonstrated the importance 
of regression calibration for self-reported physical 
activity [14,15]. A number of other studies have also 
emphasized the importance of accurate self-report mea- 
sures for epidemiology research, large school-based 
projects, and surveillance applications [14,16-18]. This 
paper helps address the need for more accurate self-report 
measures of youth by developing a robust calibration 
method for a commonly used physical activity self-report 
instrument in youth — the Physical Activity Questionnaire 
(PAQ) — which has been used to measure activity in chil- 
dren (PAQ-C) and adolescents (PAQ-A) [19,20]. 

The PAQ was selected because of its well-established 
psychometric properties and desirable measurement 
characteristics compared to other self-report measures 
for youth [21,22]. A review by Biddle and colleagues 
identified the PAQ as one of the most promising self- 
report tools available in the field [23]. Although the 
PAQ has shown good utility for field-based research 
[20,24-29], a limitation is that the outcome score is not 
readily interpretable [23]. The PAQ items are scored 
using ordinal scales (1-5 scale) and the outcome meas- 
ure is computed as a simple mean of the individual 
items [30]. This makes it difficult to relate the PAQ 
score to established public health guidelines or to quan- 
tify levels of physical activity. 

The purpose of this study was to develop and evaluate 
a calibration model that would allow raw PAQ scores to 
be converted to a more useful indicator of moderate-to- 
vigorous physical activity (MVPA) (namely, percentage 
of time in MVPA and/or minutes of MVPA) using an 
accelerometry-based activity monitor as the criterion 
measure. Accelerometers provide an objective indicator 
of free-living physical activity that can be temporally 
linked to data from a self-report tool [31]. Other "gold-standard"
measures of physical activity (e.g., doubly labeled
water, indirect calorimetry, and direct observation)
cannot satisfy these objectives, and, therefore, are not 
well suited to this type of application. 

Methods 

Participants 

The data for the study were collected as part of a school- 
based study to monitor activity that was conducted in fall 
2009 and spring 2010. Participants (n = 261; 172 collected 
in the fall and 89 in the spring) were recruited from 12 
schools (9 elementary and 3 secondary schools) from a 
small Midwestern community, Kearney, Nebraska, USA. 
Using a cluster sampling technique, 12 classrooms, grades
3 through 5, were randomly identified for sampling, along
with 12 secondary-level classrooms. Participants were required
to return signed assent and consent forms
in order to participate in the study. The study was approved
by the University of Nebraska Kearney Institutional Review 
Board. 

Instruments 

Physical activity questionnaire 

The PAQ, a self-administered 7-day (previous week) recall 
questionnaire, was designed to assess overall participation 
in PA. The PAQ-C was originally developed for use with 
elementary school children but was later adapted for mid- 
dle school and high school youth (PAQ-A). The first item is 
an activity checklist that includes several common sports, 
leisure activities, and games. The developers of the PAQs 
said this item acts as an important memory cue, which 
might suggest it was not devised to get a precise indicator 
of activity [20]. The remaining items assess activity during 
specific periods of the day, including physical education 
(PE) class, recess (included only in the PAQ-C), lunch, im- 
mediately after school, evening, and the weekend, as well as 
two additional questions that assess overall activity patterns 
during the week. Each question is scored using a scale that 
ranges from 1 to 5; the higher score indicates a higher level 
of activity. The average of the items is used to create the 
final PAQ summary score. Previous studies have supported 
the validity of the PAQ instrument for assessing general 
levels of physical activity [20,24-29]. 

ActiGraph GT1M

The ActiGraph GT1M (Actigraph, Pensacola, Florida,
USA) activity monitor was selected as the criterion
measure in the present study because of its wide acceptance
and use in physical activity assessment research
[32,33]. This activity monitor is a small, uniaxial accelerometer
that is attached by a belt to the right side of the
waist to capture accelerations from 0.05 to
2.0 g. It has a frequency band limit of 0.25-2.5 Hz. The
GT1M uses a sampling rate of 30 Hz (i.e., 30 measurements
per second) and has 1 megabyte of memory.
The available cut points for determining levels of MVPA
were developed using an older version of the Actigraph
(CSA 7164), but those cut points are also applicable
to the GT1M [34].

Procedures 

Students who returned a completed informed consent and 
assent form were included in the study. Data on weight 
and height were obtained by standardized procedures and 
used to calculate body mass index (BMI). BMI percentiles 
were computed and described using Centers for Disease 
Control and Prevention growth charts (normal: BMI
<85th percentile; overweight: BMI ≥85th and <95th
percentile; and obese: BMI ≥95th percentile). Upon collection
of anthropometric information, participants were
asked to wear an ActiGraph accelerometer for 7 consecu- 
tive days, and instructed to remove the monitor only 
during water-based activities. The accelerometer was ini- 
tialized to store activity counts every 30 seconds (i.e. for 
30-second epochs). 

After 7 days of wearing the accelerometer, participants 
were asked to return the monitor and complete a PAQ-C 
or a PAQ-A. Students in grades 3 through 5 completed a 
PAQ-C in their regular classroom while being supervised 
by the classroom teacher. Adolescents in secondary grades 
completed the PAQ-A during their PE class while being su- 
pervised by the PE teacher. 

Data processing 

The average PAQ score (1-5 scale) was defined as the self- 
reported activity index using either the PAQ-C or the 
PAQ-A (including the recess item for children but not for
adolescents). This score was computed using standard PAQ 
procedures as described by the developers of the question- 
naires [30]. 

The Actigraph data were downloaded using the software 
provided by the manufacturer (version 5.0, Actigraph, 
Pensacola, Florida), and imported into SAS v9.2 for data 
processing and screening. Strict compliance criteria for the
accelerometer were established to ensure appropriate calibration.
A day was defined as extending from 8 a.m. to 9 p.m.
to minimize the dilution of activity due to misclassification
of awake time [35]. For a day to be deemed valid, participants
had to have >70% of valid data per day
(equivalent to 9.0 hours a day); non-wear time was identified
by continuous bouts of 90 minutes at 0 counts per minute,
allowing for 2 consecutive minutes of counts per
minute from 0 to 100 [36,37]. There is no consensus on the
most appropriate method to handle non-wear time; however,
we favored a longer non-wear bout criterion to account
for extended periods of sitting (i.e., continuous bouts of
counts equal to 0) that can occur during regular class time.
For overall weekly PA levels to be counted, participants had 
to have at least 4 valid days of data (3 week days and 1 
weekend day) [38]. Counts were converted to physical ac- 
tivity estimates using scaled (i.e., 30-seconds) age-specific 
cut points for the Actigraph [32]. A standard intensity definition
for youth MVPA was used and set at >4 METs.
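
The non-wear rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SAS routine used in the study: the function name `nonwear_mask` and its parameters are hypothetical, the input is assumed to be one count value per minute (the 30-second epochs would first need to be summed to minutes), and an interruption is tolerated only when it lasts at most 2 consecutive minutes with counts between 1 and 100.

```python
def nonwear_mask(cpm, min_bout=90, tol_minutes=2, tol_max=100):
    """Flag each minute as non-wear (True) or wear (False).

    Assumed rule: a non-wear bout is a stretch of at least `min_bout`
    minutes of zero counts, tolerating interruptions of up to
    `tol_minutes` consecutive minutes with counts in 1..`tol_max`.
    """
    n = len(cpm)
    mask = [False] * n
    i = 0
    while i < n:
        if cpm[i] != 0:
            i += 1
            continue
        j = i      # scanning position
        end = i    # one past the last zero-count minute accepted so far
        run = 0    # length of the current tolerated interruption
        while j < n:
            if cpm[j] == 0:
                run = 0
                end = j + 1
            elif cpm[j] <= tol_max and run < tol_minutes:
                run += 1
            else:
                break  # interruption too long or too intense
            j += 1
        if end - i >= min_bout:
            for k in range(i, end):
                mask[k] = True
        i = end  # the bout always covers at least one minute, so i advances
    return mask
```

Wear time for a day would then be the number of minutes between 8 a.m. and 9 p.m. that are not flagged, which can be compared against the valid-day threshold.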

The average percentage of time spent in MVPA was computed
as the number of minutes in MVPA divided by the
number of minutes of wear time, and this was done separately
for weekdays (%MVPA weekday) and weekend days
(%MVPA weekend day).
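
As a worked illustration of this definition, the short Python sketch below computes %MVPA from MVPA minutes and wear minutes. The function names are hypothetical, and the valid-week check encodes one reading of the compliance rule (at least 3 valid weekdays plus at least 1 valid weekend day), which is an assumption about how the rule was applied.

```python
def pct_mvpa(mvpa_minutes, wear_minutes):
    """Percentage of wear time spent in MVPA."""
    if wear_minutes <= 0:
        raise ValueError("wear_minutes must be positive")
    return 100.0 * mvpa_minutes / wear_minutes


def has_valid_week(valid_weekdays, valid_weekend_days):
    """Assumed reading of the rule: >= 3 valid weekdays and >= 1 valid weekend day."""
    return valid_weekdays >= 3 and valid_weekend_days >= 1


# Illustrative values only: 53.5 MVPA minutes over 754.8 minutes of wear.
example = pct_mvpa(53.5, 754.8)  # roughly 7.1 percent
```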

Statistical analyses 

Data analyses were computed separately for the calibration 
and cross-validation samples. These two groups were
randomly selected and defined to represent 70% (calibration;
n = 103) and 30% (cross validation; n = 45) of the full
sample. Separate one-way ANOVAs were done to examine 
differences between age groups and to evaluate differences 
between calibration and cross-validation groups. Differ- 
ences in categorical outcomes were assessed using Pearson 
chi-square tests. 
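
A random 70/30 split of this kind might be sketched as follows. The helper `split_sample`, the seed value, and the use of truncation to set the calibration size are all assumptions made for illustration; the study's actual randomization was done in SAS and is not described in further detail.

```python
import random


def split_sample(ids, calibration_fraction=0.70, seed=2010):
    """Randomly partition participant ids into calibration and hold-out sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_cal = int(calibration_fraction * len(shuffled))  # truncates toward zero
    return shuffled[:n_cal], shuffled[n_cal:]


# With 148 participants this yields groups of 103 and 45.
calibration_ids, validation_ids = split_sample(range(148))
```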

For the calibration analyses, the average percentage of 
time spent in MVPA (%MVPA) was defined as the 
dependent variable. This outcome measure was selected 
for the calibration phase (as opposed to MVPAmin) be- 
cause it is less likely to be influenced by sample-specific 
school schedules. For example, the number of MVPA 
minutes would be directly influenced by the frequency 
and duration of active periods during school time but 
the percentage of time in MVPA would not be influenced
to the same degree. Another advantage of using
%MVPA is that it can also minimize possible differences
due to accelerometer wear time. The use of %MVPA is
more abstract when compared to minutes of PA but it is 
expected to improve the external validity of the resulting 
PAQ calibration equation. Once the percentage esti- 
mates are determined, the weekly estimated minutes of 
MVPA can be easily computed by multiplying the pre- 
dicted daily MVPA percentage by the total available mi- 
nutes per week. 

Multiple linear regression was used to determine
the relationship between the PAQ outcome score and
the average daily percentage of time spent in MVPA
(%MVPA) recorded by the Actigraph (calibration). Model
fit was evaluated based on the model R² values,
Akaike Information Criterion (AIC) values [39], and the
estimated β coefficients for each independent variable
in the model, including the PAQ score. The root mean
square error (RMSE), also known as the standard error of
estimate (SEE), was used as an indicator of model accuracy
and computed as the square root of the mean square
residuals from the overall regression ANOVA table.
Model precision was examined using the Breusch-Pagan
test for heteroscedasticity of residuals.
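
The RMSE definition above (square root of the residual mean square) can be made concrete with a small Python sketch. The function name is hypothetical; the residual degrees of freedom are taken as n minus the number of predictors minus 1, which for a sample of 103 and three predictors gives 99.

```python
import math


def rmse_from_residuals(observed, predicted, n_predictors):
    """Square root of the residual mean square (standard error of estimate)."""
    n = len(observed)
    dof = n - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("not enough observations for this many predictors")
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sse / dof)
```

Dividing by the residual degrees of freedom rather than by n is what distinguishes this quantity from a plain root mean squared difference.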

For cross validation, the model was applied to the 
remaining subsample of participants (30%) that were not 
included in the calibration analyses (an independent 
sample). Estimated daily %MVPA values were converted 
to weekly minutes of MVPA by multiplying the model-based
%MVPA values by the total weekly minutes available
for physical activity. For the present study, we assumed
that youth would have approximately 13 hours a
day to potentially be active (24 hours minus 11 hours of 
sleep/rest). Previous studies have reported average sleep 
time for children as high as 10.6 hours [35] so this is a 
reasonable approximation of available activity time. 
Thus, the %MVPA estimates (divided by 100) were multiplied by a value
of 5,460 minutes (13 hours × 7 days = 5,460 minutes) to
obtain estimates of weekly MVPA minutes. This assumption
may not be tenable for all youth but (as described
above) the approach enabled us to produce
estimates of MVPA that account for variability in activity 
time allocations during school time (e.g. recess duration 
and PE duration). 
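
The conversion from %MVPA to weekly minutes is simple arithmetic; a minimal Python sketch follows. The function name and default arguments are illustrative, with the 13-hour day taken from the assumption stated above.

```python
def weekly_mvpa_minutes(pct_mvpa, awake_hours_per_day=13, days=7):
    """Convert a daily %MVPA value into estimated weekly minutes of MVPA."""
    available_minutes = awake_hours_per_day * 60 * days  # 13 h x 7 d = 5,460 min
    return pct_mvpa / 100.0 * available_minutes
```

For example, a %MVPA of 7.1 corresponds to about 388 minutes per week under the 13-hour assumption; a different assumed waking day only changes `awake_hours_per_day`.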

The validity of the calibration algorithm was first evaluated
using a paired t-test to examine the overall difference
between the two instruments and the proportion of
variance explained (R²). The appropriateness of the developed
calibration equation was assessed based on the
RMSE (for group-level assessment) and limits of agreement
(individual-level assessment), computed as
mean difference ± (1.96 × standard deviation of the differences).
Significance level was set at .05 and all
analyses were conducted using SAS v9.2.
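
The agreement statistics named above can be sketched as follows. This is an illustration with a hypothetical function name, using the sample standard deviation of the paired differences; it is not the study's SAS code.

```python
import math
import statistics


def agreement_summary(estimated, criterion):
    """Mean bias, SD of differences, standard error, and 95% limits of agreement."""
    diffs = [e - c for e, c in zip(estimated, criterion)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return {
        "bias": bias,
        "sd": sd,
        "se": sd / math.sqrt(len(diffs)),
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
    }
```

Applied to summary values, the same arithmetic with a mean difference of 25.3 minutes and a standard deviation of 121.2 minutes gives limits of roughly -212.3 to 262.9 minutes, matching the values reported in Table 3.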

Results 

The final sample used for analysis (based upon an examination
of participant compliance with the study protocol)
included 148 youths (76 boys and 72 girls). The
average wear time was 754.8 ± 24.7 minutes, which is
close to the hypothesized 13 hours per day (780 minutes).
Overall descriptive analyses were conducted for the full
sample and separately for two different groups: calibration
(n = 103) and cross-validation samples (n = 45). The
two groups were similar in their demographic characteristics
and each included a balanced sample of fall and
spring observations in order to account for possible seasonal
differences in levels of PA.

Table 1 Descriptives for the calibration and cross-validation samples stratified by age group

Age group                 Younger            Older              Combined

Calibration sample        n = 71             n = 32             n = 103
                          M ± SD             M ± SD             M ± SD             P
Age (y)                   9.7 ± 1.1          13.3 ± 1.3         10.8 ± 2.0         <.001
Gender (% male)ᵃ          47.9               46.9               47.6               .920
Height (cm)               141.0 ± 7.9        [illegible] ± 11.7 146.0 ± 12.7       <.001
Weight (kg)               37.4 ± 9.6         60.7 ± 19.6        44.6 ± 17.2        <.001
BMI (kg/m²)               18.6 ± 3.8         23.2 ± 5.6         20.1 ± 4.9         <.001
Obese (%)ᵃ                12.7               28.1               17.4               .055
Spring (%)ᵃ               57.8               55.2               57.3               .800
MVPAmin (min)             60.7 ± 23.1        37.6 ± 20.1        53.5 ± 24.6        <.001
MVPA weekday (%)          [illegible]        [illegible]        7.1 ± [illegible]  <.001
MVPA weekend day (%)      8.0 ± 4.2          4.9 ± 5.3          7.0 ± 4.8          <.001
AA (cpm/day)              [illegible]        [illegible]        [illegible]        [illegible]
AM wear time (d)          6.0 ± 0.9          5.6 ± 0.9          5.9 ± 0.9          .025
PAQ (1-5 scale)           3.2 ± 0.7          2.8 ± 0.6          3.1 ± 0.7*         .005

Cross-validation sample   n = 32             n = 13             n = 45
                          M ± SD             M ± SD             M ± SD             P
Age (y)                   8.4 ± 3.1          13.1 ± 1.0         10.6 ± 1.9         <.001
Gender (% male)ᵃ          62.5               53.9               60.0               .590
Height (cm)               140.7 ± 8.4        157.0 ± 10.9       145.3 ± 11.7       <.001
Weight (kg)               37.3 ± 9.4         46.9 ± 11.0        40.1 ± 10.7        .001
BMI (kg/m²)               18.7 ± 3.6         18.9 ± 2.8         18.8 ± 3.3         .890
Obese (%)ᵃ                0                  0                  0                  na
Spring (%)ᵃ               46.9               61.5               51.1               .370
MVPAmin (min)             61.4 ± 21.7        39.0 ± 19.4        55.7 ± 22.2        <.001
MVPA weekday (%)          8.4 ± 3.1          5.5 ± 2.6          7.6 ± 3.2          .005
MVPA weekend day (%)      8.4 ± 4.7          4.0 ± 3.0          7.1 ± 4.7          .003
AA (cpm/day)              491.0 ± 100.5      434.6 ± 125.1      474.7 ± 109.8      .120
AM wear time (d)          6.2 ± 0.8          5.5 ± 1.1          6.0 ± 0.9          .014
PAQ (1-5 scale)           3.5 ± 0.5          2.9 ± 0.5          3.3 ± 0.6          <.001

p-values relate to age group comparison tests.
ᵃTested using Pearson chi-square test.
*Significantly different than the cross-validation sample.
M ± SD = mean ± standard deviation.
na = not applicable.
AA = Average activity.
AM = Activity monitor.

Overall, students in the older age group were taller,
heavier, and less active than their younger peers. These
differences were consistent across the calibration and
cross-validation samples, and the two samples did not
differ significantly in any variable (p > .05) except PAQ
scores [F (1,146) = 4.13, p = .04] (Table 1).

Calibration 

A multiple linear regression model was fit to the
calibration sample with three independent variables:
PAQ score, age (in years; no decimal places), and gender
(boys = 1; girls = 2). The %MVPA was defined as the
dependent variable. BMI was not considered because it
might not be feasible to obtain BMI scores when computing
youth activity levels from large samples (especially
in school settings). Nevertheless, the utility of BMI
was examined and deemed to be nonsignificant (p = .26)
when included in the calibration model.

Examination of the Spearman correlation (for PAQ) revealed
a moderate and significant linear association of
%MVPA with PAQ scores (rs(103) = .35, p < .001). This
supported the inclusion of this variable in the model and
justified the decision to proceed with linear forms of the
main independent variable. The final model explained
40% of the variability in %MVPA [R² = .40; F (3,99) =
22.10, p < .001], and age
(β = -0.84 ± 0.13; p < .001) and PAQ (β = 1.01 ± 0.39;
p = .01) were found to be significant predictors
of %MVPA. Gender approached significance (β = -0.98 ±
0.51; p = .06), and was retained in the model to account
for possible population differences between boys' and
girls' activity (Figure 1).

AIC values were computed for this model fit (with 
PAQ, age, and gender predictors) and other models with 
additional variables (namely BMI and interactions be- 
tween BMI, PAQ, age, and gender). The AIC value for 
the simple model was similar to AIC values for more 
complex models, suggesting that the simple model with 
PAQ, age, and gender predictors is reasonable. The final 
model for estimation of (daily) %MVPA was as follows: 

Daily %MVPA = 14.56 - (sex × 0.98) - (0.84 × age) + (1.01 × PAQ)

Sex was coded as "1" if male and "2" if female; age was 
coded in years (ranging from 8 y to 14 y); PAQ was the
average raw score with one decimal place. 
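
For readers who want to apply the equation, a minimal Python sketch is given below. The function name and the input check are illustrative additions; the coefficients and coding are those reported above.

```python
def predict_pct_mvpa(sex, age, paq):
    """Estimate daily %MVPA from the reported calibration equation.

    sex: 1 = male, 2 = female; age: whole years; paq: mean PAQ score (1-5).
    """
    if sex not in (1, 2):
        raise ValueError("sex must be coded 1 (male) or 2 (female)")
    return 14.56 - (sex * 0.98) - (0.84 * age) + (1.01 * paq)


# Example: a 9-year-old boy with a PAQ score of 3.0.
example = predict_pct_mvpa(sex=1, age=9, paq=3.0)  # about 9.05 %MVPA
```

Because the equation was developed on youth aged 8 to 14 years, values outside that range would be extrapolations.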

The overall accuracy of the model was equal to 2.54%
(RMSE = 2.54%) and indicated a reasonable fit to the data
(suggesting that the equation could estimate group-level
%MVPA with an error of 2.54%). The Breusch-Pagan test
(a test for heteroscedasticity) showed that the error variability
(precision) was consistent across different levels of accelerometer
activity (χ² (8) = 10.7; p = .22). Calibration regression
coefficients are presented in Table 2.

Cross validation 

Estimates of daily percentage of time in MVPA from the
calibration model (%MVPA) were multiplied by 5,460 minutes
of weekly awake time to estimate total weekly minutes
of MVPA in the cross-validation sample (n = 45).
Model-estimated values were compared to observed
accelerometer-based values of MVPA. On average, the
PAQ calibration equation produced estimates of total
minutes of MVPA similar to those from the accelerometer
(mean diff. = 25.3 ± 18.1 minutes; t (44) = 1.40, p = .17). Accelerometer
and model-estimated minutes of weekly MVPA were
moderately and significantly associated with each other,
and the estimated scores explained 40% of the variability
in accelerometer minutes of weekly MVPA with an
RMSE of 121.6 (R² = .40, F (1,43) = 28.71, p < .001)
(Figure 2).

Figure 1 Relationship between accelerometer activity levels (recorded %MVPA) and predicted activity levels (predicted %MVPA) in the
calibration sample. The solid line represents the best fit with the respective 95% confidence interval for the mean predicted values
(dashed lines).

Table 2 Calibration parameters and model evaluation indices

                      Estimate     SE       T        P
Model parameters
  Intercept           14.56        2.14     6.81     <.001
  Gender              -0.98        0.51     -1.93    .06
  Age                 -0.84        0.13     -6.74    <.001
  PAQ                 1.01         0.39     2.58     .01
Model evaluation
  R²                  0.40
  RMSE                2.54
  VIFᵃ                1.02-1.05
  Breusch-Pagan       10.7                           .22

SE = standard error.
RMSE = root mean square error.
ᵃVIF = variance inflation factor (range).
Dependent variable = %MVPA (average per day).
Gender: boys = 1; girls = 2.
Age: in years (e.g. 10).

The Pearson correlation between absolute error and
accelerometer estimates of MVPA was equal to -.24
(p = .11), supporting the assumption of a homogeneous distribution
of error. Limits of agreement (LOA) were determined
to examine individual and group-level error.
The 95% confidence interval for group-level bias suggested
that this error can range from -6% to
16% of accelerometer group estimates (deemed nonsignificant).
Individual error ranged from -56%
to +69% of the accelerometer value (Table 3).

Figure 3 provides an illustration of the relationship be- 
tween PAQ scores and estimated minutes of MVPA 
(min/week). Results are described for boys aged 9, 11, 
and 13 years. Each unit increase in the final PAQ score
(1-5 scale) was associated with an increase of 55.1 minutes
of weekly MVPA.

Discussion 

Self-report instruments have been used in many epi- 
demiology studies and have contributed to most of what 
is known regarding the relationship between physical ac- 
tivity and health [40]. Although objective instruments 
are now widely used, there is considerable need to im- 
prove the utility and accuracy of self-report measures. 
The low cost and ease of use make self-report measures 
the most feasible approach for assessing physical activity 
profiles in large and diverse groups of individuals [40]. 
The calibration procedures tested in this study provide a 
way to scale self-report data so it matches data obtained 
using objective measures. In this study, we calibrated the 
PAQ (a widely used self-report instrument in research 
with children) but the methods would have similar util- 
ity for other self-report instruments. The specific goal 
was to evaluate the validity of a simple calibration equa- 
tion to convert raw PAQ scores into a more meaningful 
outcome measure (minutes of MVPA per week) and de- 
termine if they can be used to predict group-level esti- 
mates of MVPA. 

The results support the utility of this method. The 
resulting model estimated objectively recorded activity 
with an error of 2.54%, and it explained 40% of the vari- 
ability in MVPA. A strength of the analytic approach is 
that the calibration equation was developed to predict 
the percentage of time spent in MVPA across the week. 
This value is then converted to minutes of MVPA by 
multiplying by total weekly minutes considered in this 
study (5,460 minutes of awake time, or 8 a.m. to 9 p.m., 
Monday through Sunday). This approach is more robust 
than directly estimating minutes of MVPA because it 
avoids potential error caused by future differences in the 
length of the typical day being considered (external
validity). The approach also minimizes any wear time
differences between participants.

Table 3 Agreement between weekly minutes of MVPA obtained from the PAQ and accelerometer

            PAQ MVPAᵃ        Acc MVPAᵃ        Mean biasᵃ       95% CIᵇ        LOAᶜ
Estimate    415.2 ± 113.3    389.9 ± 155.3    25.3 ± 121.2     -11.2, 61.7    -212.3, 262.9

ᵃMean ± standard deviation.
ᵇ95% confidence interval for the average mean difference (PAQ - Acc).
ᶜLimits of agreement computed as: mean difference ± (1.96 × standard deviation of the difference).

The utility of this approach was demonstrated in the 
cross-validation analyses as reasonable measurement 
agreement was obtained when it was evaluated in an in- 
dependent sample. The 95% confidence interval for 
group mean differences indicated that group-level bias 
can range from -11.2 to 61.7 minutes of MVPA, equiva- 
lent to -6% and 16% of accelerometer estimates of 
weekly minutes of MVPA. This supports the ability of 
the PAQ algorithm to estimate group-level estimates of 
accelerometer activity. The results from this independ- 
ent sample are noteworthy because they demonstrate 
that the calibration algorithm is effective in estimating 
activity in a different group of individuals. Although the 
results are promising, there is clearly significant room 
for improvement in the accuracy of this type of 
calibration. 

As stated, a potential application for this type of cali- 
brated tool would be to use it in place of more expensive 
and cumbersome objective monitors. To evaluate the 
potential utility for this type of application, we retro- 
spectively identified youth who would meet public health 
guidelines (e.g. 60 minutes of daily MVPA) [41] based 
on both the PAQ and accelerometer data. These results 
revealed a moderate and significant degree of agreement 
(area under the curve = 0.79 ± 0.07, p < .001). Approximately
65% of non-active individuals based on the
accelerometer data were correctly identified through
self-reported estimates (specificity = 65.4). Approxi- 
mately 74% of active individuals meeting guidelines with 
the accelerometer were correctly identified by the PAQ 
(sensitivity = 73.7). These results are reasonable, consid- 
ering the large discrepancies that have previously been 
reported between self-report and objective measures in 
past epidemiology studies. For example, Troiano et al. 
(2008) examined accelerometer data from the NHANES 
2003-2004 cohort and found that only 2.3% to 3.5% of 
adults met the physical activity guidelines for Americans 
(PAGA) [36]. Similarly, Tucker and colleagues [42]
found that the prevalence of adults meeting the PAGA, 
based on accelerometry, was 9.6%, even as the estimate 
was 62.0% when activity was self-reported. The preva- 
lence rates of adults meeting the PAGA reported in 
these two studies based on accelerometer data were sub- 
stantially lower than PAGA compliance based on self- 
reported activity. These discrepancies with self-report 
data have been well chronicled, but with calibration ap- 
proaches similar to those demonstrated here it would be 
possible to model self-report data so they approximate 
the patterns and distribution from objective data. 
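
The sensitivity and specificity figures discussed above come from cross-classifying youth as meeting or not meeting the guideline under each method. A minimal Python sketch of that calculation follows; the function name and the boolean-list inputs are assumptions for illustration, with the accelerometer classification treated as the reference.

```python
def sensitivity_specificity(predicted_active, truly_active):
    """Return (sensitivity, specificity) for paired boolean classifications.

    predicted_active: guideline status from the calibrated self-report.
    truly_active: guideline status from the criterion (accelerometer).
    """
    tp = fn = tn = fp = 0
    for pred, truth in zip(predicted_active, truly_active):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif not truth and not pred:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both active and non-active cases are required")
    return tp / (tp + fn), tn / (tn + fp)
```

Sensitivity is the share of accelerometer-defined active youth who are also classified as active by the self-report; specificity is the corresponding share among the non-active.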

The approach presented here would provide reason- 
ably accurate group-level estimates. It is important to 
note, however, that we observed large individual bias
ranging from -212.3 to 262.9 minutes, or -56% to
+69% of the accelerometer estimates of MVPA. Thus, it
may be premature to apply the equation for individual
estimation. However, this issue is not unique to our application.
Current calibration equations used to process
and summarize accelerometer data have also been
shown to have limitations for estimating individual data.

Figure 3 Predicted minutes of MVPA (min/week) using different PAQ scores (x-axis: PAQ score, 1-5). Estimates were generated for three boys aged 9, 11, and
13 years. The final estimated score was divided by 100 and multiplied by 5,460 minutes as a measure of weekly activity. For each PAQ score unit
increase there was an increase of 55.1 minutes of weekly MVPA.

The results of the present study support the value and 
potential of this calibration approach, but it is important 
to consider the inherent differences between objective 
(e.g. Actigraph monitor) and subjective measures (e.g. 
self-report) of physical activity since it may directly ex- 
plain some of the findings. Accelerometry-based activity 
monitors (e.g. Actigraph) provide direct measures of 
limb acceleration and output raw counts accumulated 
per pre-defined unit of time. Subjective, self-report mea- 
sures, on the other hand, provide contextual information 
about PA behaviors that may not be associated with ac- 
celeration of the limbs. Both instruments are essentially 
measuring different aspects of the same underlying be- 
havior. Based on this, it is actually quite naive to expect 
that these two instruments can provide equivalent infor- 
mation. The advantage of calibration procedures demon- 
strated in this paper is that it is possible to establish 
quantitative links between the subjective reports and 
more objectively monitored data. The methodology has 
clear promise but refinements will be needed to enable 
more accurate estimates at the individual level. 

Aspects of the design and the nature of the measures 
may have limited our ability to fully calibrate the PAQ. 
For example, the accelerometer data were collected
using 30-second epochs and this may have obscured 
shorter and more intermittent bouts of activity. How- 
ever, this limitation is somewhat minimized when activ- 
ity is aggregated into MVPA, as in the present study 
[43]. Also relevant was the fact that the current calibra- 
tion utilized data collected across a full week rather than 
treating weekdays and weekends separately. It may be 
possible to create more effective calibration equations by 
directly matching the reported times with the data re- 
corded from the accelerometer. This was not possible in 
the present analyses because the purpose of the study 
was to calibrate the original PAQ as recommended by 
the developers. It is noteworthy that the present calibra- 
tion equation yielded reasonable group-level estimates 
despite these limitations. Nevertheless, the equation 
should be used with caution until more robust evalua- 
tions are performed. The developed equation, for ex- 
ample, should be tested on another group of individuals 
across different age groups. Despite the randomized dis- 
tribution of participants into calibration and cross- 
validation groups, no obese children were included in 
the cross-validation sample. The majority of our sample 
was composed of individuals 8 to 13 years old, and, 
therefore, the results should not be generalized to older
individuals.



Conclusions 

The results demonstrate that the PAQ can be calibrated 
to provide accurate group-level estimates of MVPA. The 
findings presented here are specific to the PAQ, but 
similar approaches can be used to improve the utility of 
other self-report instruments. There is clear public 
health interest in improving self-report measures [13], 
and the calibration procedures shown here offer a way 
to get reasonable accuracy with a more feasible and 
cost-effective strategy. 